@fresh-borzoni
Contributor

@fresh-borzoni fresh-borzoni commented Jan 11, 2026

Summary

Adds admin operations and partition support to achieve feature parity with Rust/C++ bindings.

Admin Operations

  • drop_table(table_path, ignore_if_not_exists=False) - Drop a table from the cluster
  • list_offsets(table_path, bucket_ids, offset_type, timestamp=None) - List offsets for non-partitioned tables
  • list_partition_offsets(table_path, partition_name, bucket_ids, offset_type, timestamp=None) - List offsets for partitioned tables
  • create_partition(table_path, partition_spec, ignore_if_exists=False) - Create a partition
  • list_partition_infos(table_path) - List all partitions for a table

LogScanner Low-Level API

Replaces high-level subscribe with low-level methods matching Rust/C++:

  • subscribe(bucket_id, start_offset) - Subscribe to a single bucket
  • subscribe_batch(bucket_offsets) - Subscribe to multiple buckets
  • subscribe_partition(partition_id, bucket_id, start_offset) - Subscribe to a single bucket of a partitioned table
  • to_arrow() / to_pandas() now work for both partitioned and non-partitioned tables
  • to_arrow() / to_pandas() return a clear error if called without subscribing first
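
To make the "subscribe first" rule concrete, here is a minimal sketch of how a scanner could track subscriptions and reject reads when there are none. `ScannerState`, `ensure_subscribed`, and the integer types are hypothetical stand-ins chosen for illustration; this is not the actual LogScanner code from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical scanner state (an assumption, not the Fluss LogScanner).
#[derive(Default)]
pub struct ScannerState {
    // Key: (partition_id, bucket_id). partition_id is None for
    // non-partitioned tables. Value: start offset.
    subscriptions: HashMap<(Option<i64>, i32), i64>,
}

impl ScannerState {
    /// Subscribe to one bucket of a non-partitioned table.
    pub fn subscribe(&mut self, bucket_id: i32, start_offset: i64) {
        self.subscriptions.insert((None, bucket_id), start_offset);
    }

    /// Subscribe to several buckets at once, given (bucket_id, start_offset) pairs.
    pub fn subscribe_batch(&mut self, bucket_offsets: &[(i32, i64)]) {
        for &(bucket_id, start_offset) in bucket_offsets {
            self.subscribe(bucket_id, start_offset);
        }
    }

    /// Subscribe to one bucket of one partition.
    pub fn subscribe_partition(&mut self, partition_id: i64, bucket_id: i32, start_offset: i64) {
        self.subscriptions
            .insert((Some(partition_id), bucket_id), start_offset);
    }

    /// Guard that a read path could call before doing any work.
    /// Returns the number of active subscriptions, or a descriptive error.
    pub fn ensure_subscribed(&self) -> Result<usize, String> {
        if self.subscriptions.is_empty() {
            Err("no buckets subscribed; call a subscribe method before reading".to_string())
        } else {
            Ok(self.subscriptions.len())
        }
    }
}
```

Keying on `(Option<i64>, i32)` is one way to let the same read path serve both partitioned and non-partitioned tables; the real bindings may store this differently.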

Closes #148 #244

Copilot AI left a comment

Pull request overview

This PR adds two administrative operations to the Python bindings to achieve feature parity with C++ bindings: drop_table() for deleting tables and list_offsets() for querying bucket offsets. Both methods follow existing async patterns and leverage core APIs that already exist in the Rust codebase.

Changes:

  • Added Admin.drop_table() method with optional ignore_if_not_exists parameter
  • Added Admin.list_offsets() method supporting earliest, latest, and timestamp-based offset queries
  • Introduced OffsetType class with string constants for type-safe offset type specification
  • Updated example.py to demonstrate both new features
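
As an illustration of the kind of validation the overview describes for offset queries, the sketch below parses an offset type string together with an optional timestamp. The `OffsetSpec` enum, the function name, and the lowercase string values `"earliest"`, `"latest"`, and `"timestamp"` are assumptions made for this example; the constants actually defined by `OffsetType` are not shown in this thread.

```rust
/// Hypothetical offset selector covering the three query kinds named above.
#[derive(Debug, PartialEq)]
pub enum OffsetSpec {
    Earliest,
    Latest,
    Timestamp(i64),
}

/// Validate an (offset_type, timestamp) pair.
/// Assumed rule: a timestamp is required for "timestamp" and rejected otherwise.
pub fn parse_offset_spec(offset_type: &str, timestamp: Option<i64>) -> Result<OffsetSpec, String> {
    match (offset_type, timestamp) {
        ("earliest", None) => Ok(OffsetSpec::Earliest),
        ("latest", None) => Ok(OffsetSpec::Latest),
        ("timestamp", Some(ts)) => Ok(OffsetSpec::Timestamp(ts)),
        ("timestamp", None) => {
            Err("offset_type 'timestamp' requires a timestamp value".to_string())
        }
        ("earliest" | "latest", Some(_)) => Err(format!(
            "timestamp must not be set for offset_type '{offset_type}'"
        )),
        (other, _) => Err(format!("unknown offset_type: '{other}'")),
    }
}
```

Rejecting a stray timestamp for `earliest` and `latest` is a design choice in this sketch; an implementation could equally ignore it.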

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File - Description

  • bindings/python/src/lib.rs - Adds OffsetType class definition with string constants and registers it in the Python module
  • bindings/python/src/admin.rs - Implements drop_table() and list_offsets() admin methods with validation and error handling
  • bindings/python/example/example.py - Demonstrates usage of new methods with both string literals and OffsetType constants

@fresh-borzoni
Contributor Author

resolved conflict

@fresh-borzoni
Contributor Author

@luoyuxia PTAL 🙏

@fresh-borzoni
Contributor Author

I'll add partition support, since I see we added it to the C++ bindings.

@fresh-borzoni
Contributor Author

@luoyuxia I've reworked this PR to support partitioned tables and to expose the same methods the C++ bindings currently have.
PTAL 🙏

@fresh-borzoni fresh-borzoni changed the title [ISSUE-148] Support drop_table and list_offsets methods in python bin… Support drop_table and partition,offset methods in python bindings Feb 5, 2026
@fresh-borzoni fresh-borzoni changed the title Support drop_table and partition,offset methods in python bindings Support drop_table, partitions and offsets methods in python bindings Feb 5, 2026
@fresh-borzoni
Contributor Author

fresh-borzoni commented Feb 5, 2026

I think #246 should be merged first and then I'll rebase one more time.

@luoyuxia
Contributor

luoyuxia commented Feb 5, 2026

@fresh-borzoni Sorry for missing it. I'll review this weekend.

@luoyuxia
Contributor

luoyuxia commented Feb 5, 2026

@fresh-borzoni Could you please rebase it?

@fresh-borzoni
Contributor Author

@luoyuxia rebased and renamed subscribe_batch

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


Comment on lines +1915 to +1936
        let cache = self.partition_name_cache.read().unwrap();
        if let Some(map) = cache.as_ref() {
            return Ok(map.clone());
        }
    }

    // Fetch partition infos (releases GIL during async call)
    let partition_infos: Vec<fcore::metadata::PartitionInfo> = py
        .detach(|| {
            TOKIO_RUNTIME.block_on(async { self.admin.list_partition_infos(table_path).await })
        })
        .map_err(|e| FlussError::new_err(format!("Failed to list partition infos: {e}")))?;

    // Build and cache the mapping
    let map: HashMap<i64, String> = partition_infos
        .into_iter()
        .map(|info| (info.get_partition_id(), info.get_partition_name()))
        .collect();

    // Store in cache (write lock)
    {
        let mut cache = self.partition_name_cache.write().unwrap();
Copilot AI Feb 5, 2026

The unwrap() calls on RwLock::read() and RwLock::write() will panic if the lock is poisoned (which happens when a thread panics while holding the lock). While poison is rare in Python bindings due to controlled execution, it's better to handle this gracefully. Consider using expect() with a descriptive message or mapping to a PyErr using map_err() instead of unwrap().
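
A minimal, std-only sketch of the `map_err()` variant suggested here, assuming a cache shaped like the one in the excerpt. `PartitionNameCache`, `get_or_fetch`, and the `String` error type are hypothetical stand-ins; in the bindings the error would presumably be mapped to a Python exception instead.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for the binding's cache (not the PR's actual struct).
pub struct PartitionNameCache {
    inner: RwLock<Option<HashMap<i64, String>>>,
}

impl PartitionNameCache {
    pub fn new() -> Self {
        Self { inner: RwLock::new(None) }
    }

    /// Return the cached id-to-name map, or build it with `fetch` and cache it.
    /// A poisoned lock is reported as `Err` instead of panicking.
    pub fn get_or_fetch<F>(&self, fetch: F) -> Result<HashMap<i64, String>, String>
    where
        F: FnOnce() -> Result<Vec<(i64, String)>, String>,
    {
        // Fast path: take the read lock only for the duration of this block.
        {
            let cache = self
                .inner
                .read()
                .map_err(|e| format!("partition cache lock poisoned: {e}"))?;
            if let Some(map) = cache.as_ref() {
                return Ok(map.clone());
            }
        }

        // Slow path: fetch while holding no lock, then store under the write lock.
        let map: HashMap<i64, String> = fetch()?.into_iter().collect();
        let mut cache = self
            .inner
            .write()
            .map_err(|e| format!("partition cache lock poisoned: {e}"))?;
        *cache = Some(map.clone());
        Ok(map)
    }
}
```

Because the fetch runs without a lock held, two callers can race and both fetch; the last writer wins, which is harmless when both results are equivalent.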

Contributor

@luoyuxia luoyuxia left a comment

@fresh-borzoni Thanks for the PR. LGTM.

@luoyuxia luoyuxia merged commit 6de2cde into apache:main Feb 5, 2026
19 checks passed

Development

Successfully merging this pull request may close these issues.

Support drop_table and list_offsets methods in python bindings

2 participants